Q-learning: Flexible learning about useful utilities

Authors

Erica E. M. Moodie

Nema Dean

Yue Ru Sun

Abstract

Dynamic treatment regimes are fast becoming an important part of medicine, with a corresponding change in emphasis from treatment of the disease to treatment of the individual patient. Because few trials evaluate personally tailored treatment sequences, inferring optimal treatment regimes from observational data has become increasingly important. Q-learning is a popular method for estimating the optimal treatment regime, originally in randomized trials but more recently also in observational data. Previous applications of Q-learning have largely been restricted to continuous utility end-points with linear relationships. This paper is the first both to extend the framework to discrete utilities and to move the modelling of covariates from linear models to more flexible ones using the generalized additive model (GAM) framework. Results on simulated data show that GAM-adapted Q-learning typically outperforms Q-learning with linear models, as well as other frequently used methods based on propensity scores, in terms of coverage and bias/MSE. This represents a promising step towards a more fully general Q-learning approach to estimating optimal dynamic treatment regimes. This article is in technical report form; the final publication is available at http://www.springerlink.com/openurl.asp?genre=article&id=doi:10.1007/s12561-0139103-z.
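As a rough illustration of the linear baseline the abstract refers to, the following is a minimal sketch of two-stage Q-learning by backward induction with linear Q-functions. It is not the authors' implementation: the function names (`simulate`, `two_stage_q_learning`, `recommend`), the data-generating model, and the choice to let each stage depend on a single covariate are all assumptions made for illustration. The paper's contribution, replacing these linear fits with GAMs and handling discrete utilities, is not shown here.

```python
import numpy as np

def simulate(n, seed=0):
    """Hypothetical two-stage data with randomized binary treatments (assumed model)."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)                      # stage-1 covariate
    a1 = rng.integers(0, 2, size=n)              # stage-1 treatment, coded 0/1
    x2 = 0.5 * x1 + rng.normal(size=n)           # stage-2 covariate
    a2 = rng.integers(0, 2, size=n)              # stage-2 treatment, coded 0/1
    # Outcome (larger is better); the true stage-2 treatment effect is 0.5 - x2.
    y = 1.0 + x2 + a2 * (0.5 - x2) + rng.normal(size=n)
    return x1, a1, x2, a2, y

def fit_linear(design, y):
    """Ordinary least squares coefficients for y regressed on the design matrix."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

def two_stage_q_learning(x1, a1, x2, a2, y):
    """Fit linear Q-functions for stage 2, then stage 1, by backward induction."""
    # Stage 2: Q2 = b0 + b1*x2 + a2*(b2 + b3*x2)
    design2 = np.column_stack([np.ones_like(x2), x2, a2, a2 * x2])
    beta2 = fit_linear(design2, y)
    # Pseudo-outcome: fitted Q2 evaluated at the estimated best stage-2 treatment.
    effect2 = beta2[2] + beta2[3] * x2
    y_tilde = beta2[0] + beta2[1] * x2 + np.maximum(effect2, 0.0)
    # Stage 1: regress the pseudo-outcome on stage-1 covariate and treatment.
    design1 = np.column_stack([np.ones_like(x1), x1, a1, a1 * x1])
    beta1 = fit_linear(design1, y_tilde)
    return beta1, beta2

def recommend(beta, x):
    """Estimated rule: treat (1) when the fitted treatment effect is positive."""
    return (beta[2] + beta[3] * x > 0).astype(int)
```

Calling `two_stage_q_learning(*simulate(5000))` returns the stage-1 and stage-2 coefficient vectors, and `recommend(beta2, x2)` gives the estimated stage-2 rule. In this simplified setup the stage-2 model ignores stage-1 history, which a realistic analysis would normally include.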

Similar resources

An Online Q-learning Based Multi-Agent LFC for a Multi-Area Multi-Source Power System Including Distributed Energy Resources

This paper presents an online two-stage Q-learning based multi-agent (MA) controller for load frequency control (LFC) in an interconnected multi-area multi-source power system integrated with distributed energy resources (DERs). The proposed control strategy consists of two stages. The first stage employs a PID controller whose parameters are designed using sine cosine optimization (SCO...

Evolving subjective utilities: Prisoner's Dilemma game examples

We have proposed the utility-based Q-learning concept that supposes an agent internally has an emotional mechanism that derives subjective utilities from objective rewards and the agent uses the utilities as rewards of Q-learning. We have also proposed such an emotional mechanism that facilitates cooperative actions in Prisoner’s Dilemma (PD) games. However, this mechanism has been designed and...

A Q-learning Based Continuous Tuning of Fuzzy Wall Tracking

A simple, easy-to-implement algorithm is proposed to address the wall-tracking task of an autonomous robot. The robot should navigate in unknown environments, find the nearest wall, and track it based solely on locally sensed data. The proposed method couples fuzzy logic and Q-learning to meet the requirements of autonomous navigation. Fuzzy if-then rules provide a reliable decision maki...

Nursing students’ attitude about factors influencing clinical learning in Medical University of Guilan

Introduction: Education is a regular process that helps individuals acquire knowledge and new skills. Education is an active interaction between educator and learner. Learning is a stable process of change in an individual’s potential behavior. Therefore, we can only say that the students’ learning is satisfactory when learning causes proper behavioral changes ...

Dynamic Joint Action Perception for Q-Learning Agents

Q-learning is a reinforcement learning algorithm that learns expected utilities for state-action transitions through successive interactions with the environment. The algorithm's simplicity as well as its convergence properties have made it a popular algorithm for study. However, its non-parametric representation of utilities limits its effectiveness in environments with large amounts of percept...

Publication date: 2013
